
 pii type


Understanding Privacy Risks in Code Models Through Training Dynamics: A Causal Approach

arXiv.org Artificial Intelligence

Large language models for code (LLM4Code) have greatly improved developer productivity but also raise privacy concerns due to their reliance on open-source repositories containing abundant personally identifiable information (PII). Prior work shows that commercial models can reproduce sensitive PII, yet existing studies largely treat PII as a single category and overlook the heterogeneous risks among different types. We investigate whether distinct PII types vary in their likelihood of being learned and leaked by LLM4Code, and whether this relationship is causal. Our methodology includes building a dataset with diverse PII types, fine-tuning representative models of different scales, computing training dynamics on real PII data, and formulating a structural causal model to estimate the causal effect of learnability on leakage. Results show that leakage risks differ substantially across PII types and correlate with their training dynamics: easy-to-learn instances such as IP addresses exhibit higher leakage, while harder types such as keys and passwords leak less frequently. Ambiguous types show mixed behaviors. This work provides the first causal evidence that leakage risks are type-dependent and offers guidance for developing type-aware and learnability-aware defenses for LLM4Code.


SA-ADP: Sensitivity-Aware Adaptive Differential Privacy for Large Language Models

arXiv.org Artificial Intelligence

Despite advances in the use of large language models (LLMs) in downstream tasks, their ability to memorize information has raised privacy concerns. Therefore, protecting personally identifiable information (PII) during LLM training remains a fundamental challenge. Conventional methods like Differential Privacy-Stochastic Gradient Descent (DP-SGD) provide robust privacy protection via uniform noising, protecting PII regardless of its distinct sensitivity. This comes at the expense of the model's utility, leading to a trade-off. In this paper, we propose SA-ADP, a sensitivity-aware approach that allocates noise based on the sensitivity of individual PII. We evaluated our method on four datasets (ABCD, CUSTOMERSIM, Wikitext-2, and UNSW-NB15). Our results show that SA-ADP achieves results comparable to the baseline (No-DP) and the conventional DP-SGD. This means that our method did not degrade the model's utility while still maintaining strong privacy protection.
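To make the "uniform noising" of DP-SGD concrete, the following is a minimal sketch of one DP-SGD gradient update (per-example clipping followed by Gaussian noise), written in plain Python. It is not the SA-ADP algorithm: the `NOISE_BY_SENSITIVITY` table and the `sensitivity_aware_update` helper are hypothetical illustrations of what "allocating noise based on sensitivity" could look like, and the abstract does not say how SA-ADP chooses or accounts for its noise levels.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Scale one per-example gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_update(per_example_grads, clip_norm, noise_multiplier, rng):
    """Return the noised average gradient for one DP-SGD step.

    Every example is clipped to the same norm, and the same Gaussian noise
    scale (noise_multiplier * clip_norm) is applied to the summed gradient.
    """
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * clip_norm
    noised = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noised]


# Hypothetical mapping from a batch-level sensitivity label to a noise
# multiplier. The labels and values are assumptions for illustration only.
NOISE_BY_SENSITIVITY = {"low": 0.5, "medium": 1.0, "high": 2.0}


def sensitivity_aware_update(per_example_grads, sensitivity, clip_norm, rng):
    """Same update as dp_sgd_update, with noise picked by sensitivity label."""
    multiplier = NOISE_BY_SENSITIVITY[sensitivity]
    return dp_sgd_update(per_example_grads, clip_norm, multiplier, rng)
```

Calling `sensitivity_aware_update(grads, "high", 1.0, random.Random(0))` adds more noise than the `"low"` setting for the same batch. Any real scheme that varies noise in this way would also need a privacy accounting argument, which this sketch does not attempt.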


Scalable multilingual PII annotation for responsible AI in LLMs

arXiv.org Artificial Intelligence

As Large Language Models (LLMs) gain wider adoption, ensuring their reliable handling of Personally Identifiable Information (PII) across diverse regulatory contexts has become essential. This work introduces a scalable multilingual data curation framework designed for high-quality PII annotation across 13 underrepresented locales (Table I), covering approximately 336 locale-specific PII types. Our phased, human-in-the-loop annotation methodology combines linguistic expertise with rigorous quality assurance, leading to substantial improvements in recall and false positive rates from pilot, training, and production phases. Beyond reporting empirical gains, we highlight common annotator challenges in multilingual PII labeling and demonstrate how iterative, analytics-driven pipelines can enhance both annotation quality and downstream model reliability.

I. Introduction

A. PII Data Protection

The surge in user-generated content has led to vast textual corpora containing hidden instances of Personally Identifiable Information (PII) in application forms, support tickets, reviews and social media posts [1]. PII, such as NAME, SSN, and PHONE NUMBER, poses significant privacy risks if not handled correctly. Its compromise can result in identity theft, financial fraud, and unauthorized access to sensitive data [2].


PATCH: Mitigating PII Leakage in Language Models with Privacy-Aware Targeted Circuit PatcHing

arXiv.org Artificial Intelligence

Language models (LMs) may memorize personally identifiable information (PII) from training data, enabling adversaries to extract it during inference. Existing defense mechanisms such as differential privacy (DP) reduce this leakage, but incur large drops in utility. Based on a comprehensive study using circuit discovery to identify the computational circuits responsible for PII leakage in LMs, we hypothesize that specific PII leakage circuits in LMs should be responsible for this behavior. Therefore, we propose PATCH (Privacy-Aware Targeted Circuit PatcHing), a novel approach that first identifies and subsequently directly edits PII circuits to reduce leakage. PATCH achieves better privacy-utility trade-off than existing defenses, e.g., reducing recall of PII leakage from LMs by up to 65%. Finally, PATCH can be combined with DP to reduce recall of residual leakage of an LM to as low as 0.01%. Our analysis shows that PII leakage circuits persist even after the application of existing defense mechanisms. In contrast, PATCH can effectively mitigate their impact.


PII-Bench: Evaluating Query-Aware Privacy Protection Systems

arXiv.org Artificial Intelligence

The widespread adoption of Large Language Models (LLMs) has raised significant privacy concerns regarding the exposure of personally identifiable information (PII) in user prompts. To address this challenge, we propose a query-unrelated PII masking strategy and introduce PII-Bench, the first comprehensive evaluation framework for assessing privacy protection systems. PII-Bench comprises 2,842 test samples across 55 fine-grained PII categories, featuring diverse scenarios from single-subject descriptions to complex multi-party interactions. Each sample is carefully crafted with a user query, context description, and standard answer indicating query-relevant PII. Our empirical evaluation reveals that while current models perform adequately in basic PII detection, they show significant limitations in determining PII query relevance. Even state-of-the-art LLMs struggle with this task, particularly in handling complex multi-subject scenarios, indicating substantial room for improvement in achieving intelligent PII masking.
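The idea of masking only query-unrelated PII can be illustrated with a small sketch. It assumes that PII spans and their types have already been detected and that the set of query-relevant types is known; deciding that relevance is the hard part the abstract describes, and it is not solved here. The function name, the `(start, end, pii_type)` span format, and the `[TYPE]` placeholder style are all hypothetical choices, not part of PII-Bench.

```python
def mask_query_unrelated(text, pii_spans, relevant_types):
    """Mask PII spans whose type is not needed to answer the query.

    pii_spans: list of (start, end, pii_type) tuples with non-overlapping
               character offsets into text (assumed to come from a detector).
    relevant_types: set of PII types the query actually depends on.
    """
    pieces = []
    cursor = 0
    for start, end, pii_type in sorted(pii_spans):
        pieces.append(text[cursor:start])       # untouched text before the span
        if pii_type in relevant_types:
            pieces.append(text[start:end])      # keep query-relevant PII
        else:
            pieces.append(f"[{pii_type}]")      # replace query-unrelated PII
        cursor = end
    pieces.append(text[cursor:])                # trailing text after last span
    return "".join(pieces)
```

For a query such as "Which city does this person live in?", only the city span would be passed through, while names and email addresses in the context would be replaced by placeholders.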


End-to-End triplet loss based fine-tuning for network embedding in effective PII detection

arXiv.org Artificial Intelligence

Many approaches in the mobile data ecosystem inspect the network traffic generated by applications running on a user's device to detect exfiltration of personal data from that device. State-of-the-art methods rely on features extracted from HTTP requests; in this context, machine learning involves training classifiers on these features and making predictions using labelled packet traces. However, most of these methods include external feature selection before model training. Deep learning, on the other hand, typically does not require such techniques, as it can autonomously learn and identify patterns in the data without external feature extraction or selection algorithms. In this article, we propose a novel deep learning based end-to-end learning framework for predicting the exposure of personally identifiable information (PII) in mobile packets. The framework employs a pre-trained large language model (LLM) and an autoencoder to generate embeddings of network packets and then uses a triplet-loss based fine-tuning method to train the model, improving detection effectiveness on two real-world datasets. We compare our proposed detection framework with other state-of-the-art approaches for detecting PII leaks from the user's device.
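For readers unfamiliar with triplet loss, the following is a minimal sketch of the standard margin-based formulation on plain Python lists. It assumes Euclidean distance and a margin of 1.0; the abstract does not state which distance, margin, or triplet sampling strategy the paper uses, so these are illustrative choices only.

```python
import math


def euclidean(a, b):
    """L2 distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss for one (anchor, positive, negative) triplet.

    The loss is zero once the negative is at least `margin` farther from the
    anchor than the positive is; otherwise it grows with the violation.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

In a PII detection setting, the anchor and positive could be embeddings of two packets with the same label (for example, both leaking PII) and the negative an embedding of a packet with the other label, so that fine-tuning pulls same-label packets together and pushes different-label packets apart.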